An Empirically Based Approach Towards a System of Semantic Features
Abstract
A major problem in machine translation is the semantic description of lexical units, which should be based on a semantic system that is both coherent and operationalized to the greatest possible degree. This is to guarantee consistency between lexical units coded by lexicographers. This article introduces a generating device for achieving well-formed semantic feature expressions.
1. Intention and procedure
Empirical work with the verbs of the ES[...] corpus, experience in theoretical semantics, and, last but not least, the consulting of the semantic feature inventories of other machine translation systems (METAL, JAPAN, SYSTRAN, SUSY) have made it necessary to elaborate the proposal for semantic features made in ELS-3 (EUROTRA LINGUISTIC SPECIFICATIONS). These feature inventories have been taken into account, together with a large amount of existing, partly fairly traditional work on semantic feature systems in linguistics and philosophy (CHAFE, FRIEDRICH, PALMER, VENDLER), in information science (DAHLBERG), in cognitive linguistics and artificial intelligence (G.A. MILLER, G.A. MILLER & P.N. JOHNSON-LAIRD, G. LAKOFF, R. LANGACKER, B. COHEN, W.R. GARNER, ATTNEAVE, FREDERIKSEN, MINSKY, CHARNIAK, WINOGRAD, ANDERSON & BOWER, WOODS), and, last but not least, some recent work on word semantics (T. BALLMER, W. BRENNENSTUHL, J. BALLWEG, H. FROSCH), in order to meet the requirements of a manageable system of semantic features. This system is intended both to be based on a sensible theory of semantics and to satisfy the special requirements of machine translation in general and of our text type in particular. Moreover, it should be flexible enough to be enlarged, supplemented, or changed whenever this proves necessary on empirical evidence; this last requirement is made possible by the accomplishment of the first.
In trying to meet these requirements, the semantic feature inventories at hand have been enlarged, changed, and adapted to our specific purposes, and have been merged into one system of semantic features.
2. Comment on the theoretical assumptions made in different machine translation systems with respect to the semantic representation
With respect to the semantic representation, which in EUROTRA will be implemented on the interface structure (IS) level of the source and target language, it is our first and foremost aim to arrive at a coherent system of semantic features. In order not to start from nothing, the above-mentioned feature inventories have been consulted. The feature inventories developed for these machine translation systems have different shortcomings, which will be briefly commented on in the following. Since none of the proposals of the above-mentioned machine translation systems gives a sufficient definition of how to interpret the features, we will not comment here on the features themselves. A brief comment, however, is necessary on the general approach, which seems to imply theoretical assumptions about the organisation and processing of semantic units (not explicitly mentioned, since neither a theoretical nor a practical usage-based explanation is given) for which there is no empirical evidence: neither natural language processing by human beings nor efficiency in automatic processing of natural language gives support to these implied assumptions. It must be mentioned, however, that this can by no means be considered an objective comment, since for an outsider it is impossible to understand the systematic motivation of these feature inventories, for at least one of the following reasons:
The semantic features are not defined, or at least not sufficiently defined, to make clear their conceptual structure and thus how they are meant to be used.
This is especially true for the EUROTRA proposal, in which semantic features are not defined at all. This is, however, only a proposal, which has not been applied yet but is being tested at the moment. The SYSTRAN semantic features, as well as those of JAPAN, which have been worked out in a rather sophisticated way, are likewise not commented on. The semantic features of METAL are defined; their definition, however, remains rather vague. Even when taking into consideration the examples which are added, the reader does not arrive at a satisfactory understanding.
The dependencies holding between features are not explained. This is especially true for SYSTRAN, which only gives a list of features referring to arguments. METAL defines a hierarchical system consisting of two levels of semantic features, which is far from sufficient. JAPAN is worked out in a more sophisticated way with respect to this problem. Both in METAL and in JAPAN, however, relations between the dominating features are not defined. The [...] proposal gives an enumeration on the second and lowest level of the feature tree, which is just a conglomeration of semantic information that should be described at different levels in order to achieve the overall aim of a linguistically consistent semantic description.
3. A proposal for a EUROTRA semantic feature rule system
3.1. Necessity of a semantic feature rule system
Let us now put forward our conception of the two systems of semantic features with respect to their formalization. We have two grammars, one describing "SITUATION" features, the other one describing "ENTITY" features. Neither of the two systems is strictly hierarchically organized. The hierarchical principle, however, which always defines a refinement of the dominating feature, prevails.
Particularly the most general semantic features, such as the "ENTITY" features "CONCRETE"/"ABSTRACT", "COUNTABLE"/"MASS", and "NATURAL"/"ARTIFICIAL", and the "SITUATION" features "CONCRETE"/"ABSTRACT", "STATIVE"/"DYNAMIC", and "PUNCTUAL"/"DURATIVE"/"ITERATIVE", respectively, form pairs or triplets of semantic features. One feature of each of these alternations obligatorily occurs, and the descendants, which specify them, form disjoint sets.
3.2. The basic formalism
Let us now comment informally on our present conception of how the semantic features which we consider necessary so far are related to each other. We use three operations holding between semantic features in our grammar:
1) Hierarchy is the overall relation defining the derivation of the features.
2) Alternation relates a set of features, only one of which applies.
3) [...] relates semantic features obligatorily occurring together. This type of relationship is of course in the minority.
The basic idea is to describe these relations by a context-free rule system, where the rules can, for example, be of the following form:
(3.1) X = (A/B) * (C/D)
The hierarchy here is represented by the sign "=", the alternation by the sign "/", and the disjunction by the sign "*". The interpretation of the rule is the following: the feature on the left-hand side of the rule dominates the features appearing on the right-hand side. A, B, C, and D establish a refinement of X. More precisely, in this example X is specified by a pair of features, the first component of which can be either A or B and the second either C or D. The subordinate features on the right-hand side of the rule can be superordinate features themselves on the next level lower down in the hierarchy. The terminal features, that is, those features which are not defined for accepting any subordinate features, are represented by the rules
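As an illustration of the rule format in (3.1) only, the following Python sketch parses a rule of that shape and enumerates the feature expressions it licenses. It is not the paper's generating device: the function names `parse_rule` and `expand`, the dictionary encoding of the rule system, and the treatment of any feature without a rule as terminal are all assumptions made for this example. The sign "*" is read, following the interpretation given for (3.1), as requiring one choice from each parenthesized group.

```python
from itertools import product


def parse_rule(text):
    """Parse a rule such as 'X = (A/B) * (C/D)'.

    Returns the dominating feature and a list of alternation groups,
    e.g. ('X', [('A', 'B'), ('C', 'D')]). Hypothetical helper.
    """
    head, body = (part.strip() for part in text.split("=", 1))
    groups = []
    for chunk in body.split("*"):
        # Drop the surrounding parentheses, then split the alternatives.
        chunk = chunk.strip().strip("()")
        groups.append(tuple(f.strip() for f in chunk.split("/")))
    return head, groups


def expand(feature, rules):
    """Return every feature tuple derivable from `feature`.

    `rules` maps a dominating feature to its alternation groups.
    A feature with no rule is treated as terminal (an assumption).
    """
    if feature not in rules:
        return [(feature,)]
    results = []
    # Pick exactly one feature from every alternation group.
    for choice in product(*rules[feature]):
        # Each chosen feature may itself dominate further features.
        sub_expansions = [expand(f, rules) for f in choice]
        for combo in product(*sub_expansions):
            flat = tuple(x for part in combo for x in part)
            results.append((feature,) + flat)
    return results


head, groups = parse_rule("X = (A/B) * (C/D)")
rules = {head: groups}
for expression in expand("X", rules):
    print(expression)
```

Under these assumptions, the single rule (3.1) yields four expressions, one for each pairing of a feature from the first group with a feature from the second; adding a rule for A or B would refine the corresponding expressions one level further down.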
Publication date: 1986